CSHARP-5603: Add Big Endian support in BinaryVectorReader and BinaryVectorWriter #1682
…<T>() Signed-off-by: Medha Tiwari <[email protected]>
Hi @BorisDog, if everything is fine, can this be merged?
Hi @BorisDog, just following up to check if there's any update on this PR. Please let me know if any further changes are needed.
Review is pending on requested changes.
…or float32 on all platforms Signed-off-by: Medha Tiwari <[email protected]>
…ation Signed-off-by: Medha Tiwari <[email protected]>
…ndling Signed-off-by: Medha Tiwari <[email protected]>
The tests fail on net472.
Looks good! Tests are passing as well.
A few styling comments, plus test improvements.
public void ReadSingleLittleEndian_should_throw_on_insufficient_length()
{
var shortBuffer = new byte[3];
Assert.Throws<ArgumentOutOfRangeException>(() =>
Please switch to Record.Exception (some examples in BinaryVectorSerializerTests.cs).
{
return MemoryMarshal.Cast<T, byte>(span).ToArray();
}
int elementSize = Marshal.SizeOf<T>();
Please use var where possible.
throw new NotSupportedException("Binary vector data is not supported on Big Endian architecture yet.");
}
case BinaryVectorDataType.Float32:
var length = vectorData.Length * sizeof(float);
We can just use vectorData.Length * 4.
The Float32 format is defined as 32 bits, and 4 is hardcoded in all other places.
resultBytes[1] = padding;

var floatSpan = MemoryMarshal.Cast<TItem, float>(vectorData);
Span<byte> floatOutput = resultBytes.AsSpan(2);
minor: var
@@ -35,15 +36,41 @@ public static byte[] WriteToBytes<TItem>(BinaryVector<TItem> binaryVector)
public static byte[] WriteToBytes<TItem>(ReadOnlySpan<TItem> vectorData, BinaryVectorDataType binaryVectorDataType, byte padding)
where TItem : struct
{
if (!BitConverter.IsLittleEndian)
byte[] resultBytes;
This should be defined in the Float32 case.
It can also be simplified to result.
#endif
}

// This layout trick allows safely reinterpreting float as int and vice versa.
No need for the comment.
A few more minor comments.
case BinaryVectorDataType.Float32:
byte[] result;
var length = vectorData.Length * 4;
result = new byte[2 + length];
var result = new byte[2 + length]; is sufficient.
result = new float[count];
for (int i = 0; i < count; i++)
Please use 4 spaces instead of a tab.
BinaryPrimitivesCompat.ReadSingleLittleEndian(shortBuffer));

exception.Should().BeOfType<ArgumentOutOfRangeException>();
exception.Message.Should().Contain("length");
Please use the following pattern:
var e = exception.Should().BeOfType<ArgumentOutOfRangeException>().Subject;
e.ParamName.Should().Be("length");
and apply it in WriteSingleLittleEndian_should_throw_on_insufficient_length as well.
This also seems to be the reason for the ReadSingleLittleEndian_should_throw_on_insufficient_length and WriteSingleLittleEndian_should_throw_on_insufficient_length failures on net472.
#else
if (source.Length < 4)
{
throw new ArgumentOutOfRangeException(nameof(source), "Source span is too small to contain a float.");
nameof(source.Length)?
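For context, the fallback path discussed in this thread can be sketched roughly as follows. This is an illustration, not the PR's actual code: the class name `SingleCompatSketch` and the `FloatIntUnion` struct are hypothetical, and the sketch assumes `System.Buffers.Binary.BinaryPrimitives` is available on the target (on older frameworks that would typically mean referencing the System.Memory package). It keeps `nameof(source)` from the quoted snippet; which parameter name to report is the open question raised above.

```csharp
using System;
using System.Buffers.Binary;
using System.Runtime.InteropServices;

// Hypothetical helper for illustration only; not the driver's BinaryPrimitivesCompat.
public static class SingleCompatSketch
{
    // Both fields share offset 0, so writing one and reading the other
    // reinterprets the same 32 bits without unsafe code.
    [StructLayout(LayoutKind.Explicit)]
    private struct FloatIntUnion
    {
        [FieldOffset(0)] public float FloatValue;
        [FieldOffset(0)] public int IntValue;
    }

    public static float ReadSingleLittleEndian(ReadOnlySpan<byte> source)
    {
        if (source.Length < 4)
        {
            // Parameter name taken from the quoted snippet (assumption: may differ in the final PR).
            throw new ArgumentOutOfRangeException(nameof(source), "Source span is too small to contain a float.");
        }

        // ReadInt32LittleEndian yields the same integer on any host, so the
        // reinterpreted float does not depend on the CPU's byte order.
        var union = new FloatIntUnion { IntValue = BinaryPrimitives.ReadInt32LittleEndian(source) };
        return union.FloatValue;
    }
}
```

With this shape, a test can capture the exception and assert on `ParamName` rather than on the message text, which is the pattern requested in the review.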
Description
This PR adds Big Endian support for System.Single (Float32) to the BinaryVectorWriter.WriteToBytes() method.
Background
While running the MongoDB.Bson.Tests test suite on a Big Endian (s390x) system, we encountered 34 consistent test failures within the BinaryVectorSerializerTests class.
Each failure was caused by a System.NotSupportedException indicating that binary vector data of float32 type is not yet supported on Big Endian architectures.
Exception Observed
Sample Failing Tests
Some of the test cases that failed due to this limitation include:
BinaryVectorSerializerTests.BinaryVectorSerializer_should_deserialize_bson_vector<Float32>
BinaryVectorSerializerTests.BinaryVectorSerializer_should_serialize_bson_vector<Float32>
BinaryVectorSerializerTests.ArrayAsBinaryVectorSerializer_should_deserialize_bson_vector<Float32>
BinaryVectorSerializerTests.ArrayAsBinaryVectorSerializer_should_serialize_bson_vector<Float32>
BinaryVectorSerializerTests.MemoryAsBinaryVectorSerializer_should_serialize_bson_vector<Float32>
BinaryVectorSerializerTests.MemoryAsBinaryVectorSerializer_should_deserialize_bson_vector<Float32>
BinaryVectorSerializerTests.ReadOnlyMemoryAsBinaryVectorSerializer_should_serialize_bson_vector<Float32>
BinaryVectorSerializerTests.ReadOnlyMemoryAsBinaryVectorSerializer_should_deserialize_bson_vector<Float32>
Why This Fix Is Necessary
This limitation was blocking test pass status on Big Endian platforms such as s390x. Adding support for float32 serialization in Big Endian format:
Enables consistent behavior across architectures
Completes existing deserialization support added earlier in BinaryVectorReader.cs
Changes Introduced
Added Big Endian branch to BinaryVectorWriter.WriteToBytes() for T == float.
Used BinaryPrimitives.WriteSingleBigEndian() to write bytes in the correct order.
Left existing Little Endian logic untouched to preserve behavior.
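As a rough illustration of an endianness-independent write path, here is a minimal sketch. It makes several assumptions that the description does not confirm: that the serialized float32 payload is little-endian on every host (the review comments in this conversation reference `ReadSingleLittleEndian` and `WriteSingleLittleEndian` helpers), that the first header byte carries the data type while the second carries the padding, and that `BitConverter.SingleToInt32Bits` is available on the target framework. `Float32VectorSketch` and `WriteFloat32` are hypothetical names, not the driver's API.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper for illustration only; not BinaryVectorWriter itself.
public static class Float32VectorSketch
{
    public static byte[] WriteFloat32(ReadOnlySpan<float> values, byte dataType, byte padding)
    {
        // Two header bytes followed by 4 bytes per float32 element.
        var result = new byte[2 + values.Length * 4];
        result[0] = dataType; // assumption: data type occupies the first header byte
        result[1] = padding;

        var output = result.AsSpan(2);
        for (var i = 0; i < values.Length; i++)
        {
            // Convert to raw bits, then write them in an explicit byte order so the
            // output is identical on little-endian and big-endian hosts.
            var bits = BitConverter.SingleToInt32Bits(values[i]);
            BinaryPrimitives.WriteInt32LittleEndian(output.Slice(i * 4, 4), bits);
        }

        return result;
    }
}
```

Writing each element through an explicit-endianness call avoids the `MemoryMarshal.Cast` shortcut, which copies bytes in host order and is therefore only suitable when the host order already matches the wire format.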
cc: @giritrivedi